

Search for: All records

Creators/Authors contains: "Carmona, Rene"



  1. The classical Kuramoto model is studied in the setting of an infinite horizon mean field game. The system is shown to exhibit both synchronization and phase transition. Incoherence below a critical value of the interaction parameter is demonstrated by the stability of the uniform distribution. Above this value, the game bifurcates and develops self-organizing time homogeneous Nash equilibria. As interactions get stronger, these stationary solutions become fully synchronized. Results are proved by an amalgam of techniques from nonlinear partial differential equations, viscosity solutions, stochastic optimal control, and stochastic processes. (Keywords: mean field games; Kuramoto model; synchronization; viscosity solutions. 2020 Mathematics Subject Classification: 35Q89; 35D40; 39N80; 91A16; 92B25.)

     From the paper's introduction: Originally motivated by systems of chemical and biological oscillators, the classical Kuramoto model [1] has found an amazing range of applications, from neuroscience to Josephson junctions in superconductors, and has become a key mathematical model for describing self-organization in complex systems. The autonomous oscillators are coupled through a nonlinear interaction term which plays a central role in the long-time behavior of the system. While the system is unsynchronized when this term is not sufficiently strong, it exhibits an abrupt transition to self-organization above a critical value of the interaction parameter. Synchronization is an emergent property observed in a broad range of complex systems, such as neural signals, heartbeats, firefly lights, and circadian rhythms, and the Kuramoto dynamical system is widely used as the main phenomenological model. The expository papers [2, 3] and the references therein provide an excellent introduction to the model and its applications. The analysis of coupled Kuramoto oscillators through a mean field game formalism was first explored in [4, 5], which proved bifurcation from incoherence to coordination by a formal linearization and a spectral argument; [6] further develops this analysis in an application to a jet-lag recovery model. We follow these pioneering studies and analyze the Kuramoto model as a discounted infinite horizon stochastic game in the limit as the number of oscillators goes to infinity. We treat the system of oscillators as an infinite particle system, but instead of positing the dynamics of the particles, we let the individual particles endogenously determine their behaviors by minimizing a cost functional, hopefully settling into a Nash equilibrium. Once the search for equilibrium is recast in this way, equilibria are given by solutions of nonlinear systems. Analytically, they are characterized by a backward dynamic…
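For reference, a standard form of the N-oscillator Kuramoto dynamics with idiosyncratic noise reads as follows (our notation; the paper's normalization and noise structure may differ):

```latex
d\theta^i_t \;=\; \Big( \omega_i \;+\; \frac{K}{N} \sum_{j=1}^{N} \sin\big(\theta^j_t - \theta^i_t\big) \Big)\, dt \;+\; \sigma\, dB^i_t,
\qquad i = 1, \dots, N,
```

where the \omega_i are intrinsic frequencies and K is the interaction parameter. In the mean field game version studied here, the interaction drift is replaced by a control that each oscillator chooses so as to minimize a discounted cost, and the strength of the interaction entering that cost plays the role of K in the phase transition.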
  2. We develop a general reinforcement learning framework for mean field control (MFC) problems. Such problems arise, for instance, as the limit of collaborative multi-agent control problems when the number of agents is very large. The asymptotic problem can be phrased as the optimal control of a nonlinear dynamics. It can also be viewed as a Markov decision process (MDP), but the key difference from the usual RL setup is that the dynamics and the reward now depend on the state's probability distribution itself. Alternatively, it can be recast as an MDP on the Wasserstein space of measures. In this work, we introduce generic model-free algorithms based on the state-action value function at the mean field level, and we prove convergence for a prototypical Q-learning method. We then implement an actor-critic method and report numerical results on two archetypal problems: a finite space model motivated by a cyber security application and a continuous space model motivated by an application to swarm motion.
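As a rough illustration of what model-free learning with mean field coupling can look like, here is a minimal tabular Q-learning sketch in which a simulated population supplies the empirical state distribution and the reward depends on it. The congestion-style environment and all parameters are invented for illustration; they are not the paper's algorithms or benchmarks.

```python
import numpy as np

# Minimal sketch: tabular Q-learning for a toy mean field control problem.
# The environment and reward are illustrative assumptions, not the paper's.
rng = np.random.default_rng(0)

S, A = 5, 2            # finite state and action spaces
N = 1000               # population size used to estimate the distribution
gamma, lr, eps = 0.9, 0.1, 0.1

def step(x, a, mu):
    """Toy dynamics: the action drifts the agent left or right on a ring;
    the reward penalizes landing in a crowded state (mean field coupling)."""
    x_next = (x + (1 if a == 1 else -1)) % S
    return x_next, -mu[x_next]

Q = np.zeros((S, A))                  # state-action values of a generic agent
pop = rng.integers(0, S, size=N)      # states of the simulated population

for it in range(2000):
    mu = np.bincount(pop, minlength=S) / N    # empirical state distribution
    greedy = Q.argmax(axis=1)[pop]            # greedy action per agent
    explore = rng.random(N) < eps
    acts = np.where(explore, rng.integers(0, A, size=N), greedy)
    for i in range(N):
        x_next, r = step(pop[i], acts[i], mu)
        Q[pop[i], acts[i]] += lr * (r + gamma * Q[x_next].max() - Q[pop[i], acts[i]])
        pop[i] = x_next
```

In this toy the learned greedy policy tends to spread the population out, which is the qualitative behavior one expects from a congestion reward; the point of the sketch is only that the mean field coupling enters through the empirical distribution mu.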
  3. We investigate reinforcement learning for mean field control problems in discrete time, which can be viewed as Markov decision processes for a large number of exchangeable agents interacting in a mean field manner. Such problems arise, for instance, when a large number of robots communicate through a central unit dispatching the optimal policy computed by minimizing the overall social cost. An approximate solution is obtained by learning the optimal policy of a generic agent interacting with the statistical distribution of the states of the other agents. We rigorously prove the convergence of exact and model-free policy gradient methods in a mean-field linear-quadratic setting, and we provide graphical evidence of the convergence based on implementations of our algorithms.
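A minimal numerical illustration of policy gradient on a mean-field linear-quadratic problem follows, with a two-parameter linear feedback in the agent's own state and the population mean. All coefficients are invented, and the two-point zeroth-order gradient estimate is a simple stand-in for the exact and model-free gradients analyzed in the paper.

```python
import numpy as np

# Sketch: zeroth-order policy gradient on a toy mean-field LQ problem.
# Dynamics, costs, and parameters are illustrative assumptions only.
rng = np.random.default_rng(1)
N, T = 500, 20          # population size and horizon

def social_cost(theta):
    k1, k2 = theta
    x = rng.normal(size=N)                    # initial population states
    cost = 0.0
    for _ in range(T):
        xbar = x.mean()
        u = -k1 * x - k2 * xbar               # linear feedback policy
        cost += np.mean(x**2 + 0.5 * u**2 + (x - xbar)**2)
        x = 0.5 * x + u + 0.3 * xbar + 0.1 * rng.normal(size=N)
    return cost / T

theta = np.zeros(2)
lr, delta = 0.02, 0.1
for it in range(300):
    v = rng.normal(size=2)
    v /= np.linalg.norm(v)                    # random search direction
    # two-point finite-difference estimate of the social-cost gradient
    g = (social_cost(theta + delta * v) - social_cost(theta - delta * v)) / (2 * delta) * v
    theta -= lr * g
```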
  4. We introduce and investigate certain N-player dynamic games on the line and in the plane that admit Coulomb gas dynamics as a Nash equilibrium. Most significantly, we find that the universal local limit of the equilibrium is sensitive to the chosen model of player information in one dimension but not in two dimensions. We also find that players can achieve game-theoretic symmetry through selfish behavior despite the non-exchangeability of states, which allows us to establish strong localized convergence of the N-Nash systems to the expected mean field equations against locally optimal player ensembles, i.e., those exhibiting the same local limit as the Nash-optimal ensemble. In one dimension, this convergence notably features a nonlocal-to-local transition in the population dependence of the N-Nash system.
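The equilibrium dynamics referred to are of Coulomb gas (Dyson) type. A minimal Euler-scheme simulation in one dimension is sketched below; the harmonic confinement and the normalization are illustrative choices, not the paper's model.

```python
import numpy as np

# Euler scheme for 1D Coulomb gas (Dyson-type) dynamics with harmonic
# confinement; normalization and parameters are illustrative only.
rng = np.random.default_rng(2)
N, dt, steps, beta = 50, 1e-4, 5000, 2.0

x = np.sort(rng.normal(size=N))
for _ in range(steps):
    diff = x[:, None] - x[None, :]           # pairwise gaps x_i - x_j
    np.fill_diagonal(diff, np.inf)           # drop self-interaction (1/inf = 0)
    repulsion = np.sum(1.0 / diff, axis=1) / N   # logarithmic pair repulsion
    x += (repulsion - x) * dt + np.sqrt(2.0 * dt / (beta * N)) * rng.normal(size=N)
# x now approximates a particle cloud whose empirical law is near the
# equilibrium of the confined gas.
```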
  5. A mean field game is proposed for the synchronization of oscillators facing conflicting objectives. Our motivation is to offer an alternative to recent attempts to use dynamical systems to illustrate some of the idiosyncrasies of jet lag recovery. Our analysis is driven by two goals: (1) to understand the long time behavior of the oscillators when an individual remains in the same time zone, and (2) to quantify the costs of jet lag recovery when the individual has traveled across time zones. Finite difference schemes are used to find numerical approximations to the mean field game solutions. They are benchmarked against explicit solutions derived for a special case. Numerical results are presented and conjectures are formulated. The numerics suggest that the cost the oscillators accrue while recovering is larger for eastward travel, which is consistent with the widely held view that jet lag is worse after traveling east than west.
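The object such finite difference schemes discretize is a coupled Hamilton-Jacobi-Bellman/Fokker-Planck system. In a generic stationary, discounted form on the circle it reads as follows (schematic notation with discount beta, noise sigma, and running cost c; the paper's cost and setup differ in their details):

```latex
\begin{aligned}
&\beta\, u(\theta) \;-\; \frac{\sigma^2}{2}\, u''(\theta) \;+\; \frac{1}{2}\, u'(\theta)^2 \;=\; c(\theta, m), \\
&\frac{\sigma^2}{2}\, m''(\theta) \;+\; \big(m(\theta)\, u'(\theta)\big)' \;=\; 0, \qquad \int_{\mathbb{T}} m(\theta)\, d\theta \;=\; 1,
\end{aligned}
```

with optimal feedback drift \alpha^*(\theta) = -u'(\theta): the value function u solves the HJB equation given the population density m, and m is in turn the stationary density of the optimally controlled dynamics.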
  6. The price of anarchy, originally introduced to quantify the inefficiency of selfish behavior in routing games, is extended to mean field games. The price of anarchy is defined as the ratio of a worst case social cost computed for a mean field game equilibrium to the optimal social cost as computed by a central planner. We illustrate properties of such a price of anarchy on linear quadratic extended mean field games, for which explicit computations are possible. A necessary and sufficient condition for the absence of a price of anarchy is presented. Various asymptotic behaviors of the price of anarchy are proved for limiting regimes of the coefficients in the model, and numerical results are presented.
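In symbols, writing E for the set of mean field game equilibria and J^social for the social cost functional (our notation), the definition reads:

```latex
\mathrm{PoA} \;=\; \frac{\displaystyle \sup_{\hat{\mu} \in \mathcal{E}} \; J^{\mathrm{social}}(\hat{\mu})}{\displaystyle \inf_{\mu} \; J^{\mathrm{social}}(\mu)} \;\geq\; 1,
```

where the infimum in the denominator is the central planner's optimum; "no price of anarchy" then corresponds to PoA = 1.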
  7. We use the recently developed probabilistic analysis of mean field games with finitely many states in the weak formulation to set up a principal/agent contract theory model in which the principal faces a large population of agents interacting in a mean field manner. We reduce the problem to the optimal control of dynamics of McKean-Vlasov type, and we solve this problem explicitly in a special case reminiscent of linear-quadratic mean field game models. The paper concludes with a numerical example demonstrating the power of the results when applied to a simple example of epidemic containment.
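The generic shape of a McKean-Vlasov (mean field) control problem is the following (schematic, in a diffusion setting; the paper's finite-state, weak-formulation framework differs in its details):

```latex
\inf_{\alpha}\; \mathbb{E}\left[\int_0^T f\big(t, X_t, \mathcal{L}(X_t), \alpha_t\big)\, dt \;+\; g\big(X_T, \mathcal{L}(X_T)\big)\right],
\qquad
dX_t \;=\; b\big(t, X_t, \mathcal{L}(X_t), \alpha_t\big)\, dt \;+\; \sigma\, dW_t,
```

where \mathcal{L}(X_t) denotes the law of the state X_t: unlike a standard control problem, the coefficients and costs depend on the distribution of the controlled state itself.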
  8. We develop a probabilistic approach to continuous-time finite state mean field games. Based on an alternative description of continuous-time Markov chains by means of semimartingales and on the weak formulation of stochastic optimal control, our approach not only allows us to tackle a mean field of states and a mean field of controls at the same time, but also extends the strategy set of players from Markov strategies to closed-loop strategies. We show the existence and uniqueness of a Nash equilibrium for the mean field game, as well as how the mean field game equilibrium yields an approximate Nash equilibrium for the game with a finite number of players, under different assumptions of structure and regularity on the cost functions and the transition rates between states.
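The semimartingale description in question is, at its core, the standard Dynkin martingale: for a continuous-time Markov chain X with rate matrix Q on a finite state space, and any function f on the states,

```latex
M^f_t \;=\; f(X_t) \;-\; f(X_0) \;-\; \int_0^t (Qf)(X_s)\, ds
```

is a martingale. Controls then act by modifying the transition rates, which in the weak formulation is implemented through an equivalent change of probability measure rather than by changing the state process itself.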
  9. This project investigates numerical methods for solving fully coupled forward-backward stochastic differential equations (FBSDEs) of McKean-Vlasov type. Numerical solvers for such mean field FBSDEs are of interest because of the potential application of these equations to optimization problems over large populations, for instance mean field games (MFG) and optimal mean field control problems. The theory for this kind of problem has met with great success since the early works on mean field games by Lasry and Lions, see [29], and by Huang, Caines, and Malhame, see [26]. Generally speaking, the purpose is to understand the continuum limit of optimizers, or of equilibria (say in the Nash sense), as the number of underlying players tends to infinity. When approached from the probabilistic viewpoint, solutions to these control problems (or games) can be described by coupled mean field FBSDEs, meaning that the coefficients depend upon the marginal laws of the solution itself. In this note, we detail two methods for solving such FBSDEs, which we implement and apply to five benchmark problems. The first method uses a tree structure to represent the pathwise laws of the solution, whereas the second method uses a grid discretization to represent the time marginal laws of the solutions. Both are based on a Picard scheme; importantly, we combine each of them with a generic continuation method that makes it possible to extend the time horizon (or, equivalently, the coupling strength between the two equations) for which the Picard iteration converges.
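Schematically, the mean field FBSDEs in question take the following form (constant volatility shown for simplicity; the setting treated in the paper can be more general):

```latex
\begin{aligned}
dX_t &= B\big(t, X_t, Y_t, \mathcal{L}(X_t)\big)\, dt + \sigma\, dW_t, & X_0 &\sim \mu_0, \\
dY_t &= -F\big(t, X_t, Y_t, \mathcal{L}(X_t)\big)\, dt + Z_t\, dW_t, & Y_T &= G\big(X_T, \mathcal{L}(X_T)\big).
\end{aligned}
```

Both solvers iterate a Picard map on this system: freeze the flow of marginal laws, solve the resulting standard FBSDE, recompute the laws from the new solution, and repeat; the continuation method starts this iteration from a nearby problem (shorter horizon, or weaker coupling) where it is known to converge and extends it step by step.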